Add more unittest #304

Merged
merged 30 commits into main
Jun 26, 2024

Conversation

pan-x-c (Collaborator) commented Apr 24, 2024

  1. Set up a local GitHub Actions runner on a GPU machine
  2. Use docker-compose to set up a cluster
  3. Add more unit tests
    • single process
    • multi-process
    • multi-process with GPU
    • Ray with CPU
    • Ray with GPU
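
As a rough illustration of how one test suite could be reused across the execution modes listed above, here is a minimal sketch of tag-based test selection built on the standard `unittest` module. The names `tagged`, `select_tests`, and `UppercaseMapperTest` are hypothetical and chosen only for this example; they are not the helpers this PR adds to `data_juicer/utils/unittest_utils.py`. The sketch assumes that untagged tests run in "standalone" mode only, loosely mirroring the "run all test in standalone mode by default" commit.

```python
import unittest


def tagged(*tags):
    """Hypothetical decorator: record which execution modes a test supports."""
    def decorator(func):
        func.__test_tags__ = set(tags)
        return func
    return decorator


def select_tests(case_cls, tag):
    """Build a suite of the tests in case_cls that carry the given tag.

    Untagged tests are treated as 'standalone' only (an assumption of this sketch).
    """
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for name in loader.getTestCaseNames(case_cls):
        method = getattr(case_cls, name)
        tags = getattr(method, "__test_tags__", {"standalone"})
        if tag in tags:
            suite.addTest(case_cls(name))
    return suite


class UppercaseMapperTest(unittest.TestCase):
    """Toy test case standing in for an operator test."""

    def test_basic(self):
        # No tag: selected for 'standalone' runs only.
        self.assertEqual("abc".upper(), "ABC")

    @tagged("standalone", "ray")
    def test_batch(self):
        # Tagged: selected for both 'standalone' and 'ray' runs.
        self.assertEqual([s.upper() for s in ["a", "b"]], ["A", "B"])


if __name__ == "__main__":
    for mode in ("standalone", "ray"):
        suite = select_tests(UppercaseMapperTest, mode)
        print(mode, suite.countTestCases())
        unittest.TextTestRunner(verbosity=0).run(suite)
```

A CI workflow could then call `select_tests` once per mode, so that only tests explicitly marked for Ray run on the cluster job while everything else stays in the default standalone job.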

pan-x-c added the enhancement label Apr 24, 2024
pan-x-c self-assigned this Apr 24, 2024
pan-x-c changed the title from "[WIP] Add more unittest" to "Add more unittest" May 7, 2024

This PR is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this PR will be closed in 3 days.

.github/workflows/docker/docker-compose.yml (resolved)
data_juicer/utils/unittest_utils.py (resolved)
.github/workflows/unittest.yml (outdated, resolved)
github-actions bot removed the stale-pr label May 31, 2024
yxdyc merged commit c749a28 into main Jun 26, 2024
3 checks passed
yxdyc pushed a commit that referenced this pull request Jul 17, 2024
* modelscope-sora news (#323)

* News/modelscope sora (#327)

* modelscope-sora news

* remove empower

* debug for gpu rank for analyser (#329)

* debug for gpu rank for analyser

* spec_numprocs -> num_proc

* Add more unittest  (#304)

* add unittest env with gpu

* fix unittest yml

* add environment for unittest

* update workflow trigger

* update install step

* fix install command

* update working dir

* update container

* update working dir

* change working directory

* change working directory

* change working directory

* change working directory

* change unittest

* use test tag

* finish tag support

* support run op with different executor

* fix pre-commit

* add hf mirror

* add hf mirror

* run all test in standalone mode by default

* ignore image face ratio

* update tags

* add ray testcase

* add ray test in workflow

* update ray unittest workflow

* delete old unittest

---------

Co-authored-by: root <panxuchen>

* Add source tag (#317)

* add source tag for some mapper op

* fix no attribute 'current_tag' when executing local tests

* move op process logic from executor to base op

* fix typo

* move export outside op

* init refactor

* update analyser

* fix format

* clean up

* bring back batch mapper

* Improve fault tolerance & Fix Ray executor

* fix wrapper

* fix batched filter

* Remove use_actor as it is not compatible with the refactored OP class, unless the dataset class is refactored

* make wrappers work with unittests

* Compatible with unit tests and works with ray

* fix unittest

* fix wrappers with ray, map, filter

* unify unittests

* wrap deduplicators

* Compatible with non-batched calls

* Class-level wrappers

- compatible with dataset.filter
- bring back nested wrappers

* Instance-level wrappers

* Refined instance-level wrappers

- Remove incomplete dataset.filter wrappers
- Simplify code
- Stack wrappers

* fix use_cuda

* Refactor dataset (#348)

* refactor dataset

* update unittest with DJDataset

* fix unittest

* update ray data load

* add test

* ray read json

* update docker image version

* actor is no longer supported

* Regress filter's stats export logic

---------

Co-authored-by: BeachWang <[email protected]>
Co-authored-by: Xuchen Pan <[email protected]>
Co-authored-by: chenhesen <[email protected]>
Co-authored-by: garyzhang99 <[email protected]>
yxdyc added a commit that referenced this pull request Jul 18, 2024
* Refactor OP & Dataset (#336)

* modelscope-sora news (#323)

* News/modelscope sora (#327)

* modelscope-sora news

* remove empower

* debug for gpu rank for analyser (#329)

* debug for gpu rank for analyser

* spec_numprocs -> num_proc

* Add more unittest  (#304)

* add unittest env with gpu

* fix unittest yml

* add environment for unittest

* update workflow trigger

* update install step

* fix install command

* update working dir

* update container

* update working dir

* change working directory

* change working directory

* change working directory

* change working directory

* change unittest

* use test tag

* finish tag support

* support run op with different executor

* fix pre-commit

* add hf mirror

* add hf mirror

* run all test in standalone mode by default

* ignore image face ratio

* update tags

* add ray testcase

* add ray test in workflow

* update ray unittest workflow

* delete old unittest

---------

Co-authored-by: root <panxuchen>

* Add source tag (#317)

* add source tag for some mapper op

* fix no attribute 'current_tag' when executing local tests

* move op process logic from executor to base op

* fix typo

* move export outside op

* init refactor

* update analyser

* fix format

* clean up

* bring back batch mapper

* Improve fault tolerance & Fix Ray executor

* fix wrapper

* fix batched filter

* Remove use_actor as it is not compatible with the refactored OP class, unless the dataset class is refactored

* make wrappers work with unittests

* Compatible with unit tests and works with ray

* fix unittest

* fix wrappers with ray, map, filter

* unify unittests

* wrap deduplicators

* Compatible with non-batched calls

* Class-level wrappers

- compatible with dataset.filter
- bring back nested wrappers

* Instance-level wrappers

* Refined instance-level wrappers

- Remove incomplete dataset.filter wrappers
- Simplify code
- Stack wrappers

* fix use_cuda

* Refactor dataset (#348)

* refactor dataset

* update unittest with DJDataset

* fix unittest

* update ray data load

* add test

* ray read json

* update docker image version

* actor is no longer supported

* Regress filter's stats export logic

---------

Co-authored-by: BeachWang <[email protected]>
Co-authored-by: Xuchen Pan <[email protected]>
Co-authored-by: chenhesen <[email protected]>
Co-authored-by: garyzhang99 <[email protected]>

* minor fix

* fix num_proc default None

---------

Co-authored-by: Ce Ge (戈策) <[email protected]>
Co-authored-by: BeachWang <[email protected]>
Co-authored-by: Xuchen Pan <[email protected]>
Co-authored-by: chenhesen <[email protected]>
Co-authored-by: garyzhang99 <[email protected]>
Co-authored-by: null <[email protected]>